VISTA: Validating and Refining Clusters via Visualization (final version)

Authors

  • Keke Chen
  • Ling Liu
Abstract

Clustering is an important technique for understanding large multi-dimensional datasets. Most clustering research to date has focused on developing automatic clustering algorithms and cluster validation methods. The automatic algorithms are known to work well with clusters of regular shapes, e.g. compact spherical shapes, but may incur higher error rates when dealing with arbitrarily shaped clusters. Although some effort has been devoted to the problem of skewed datasets, the handling of irregularly shaped clusters is still in its infancy, especially with respect to the dimensionality of the datasets and the precision of the clustering results considered. Not surprisingly, statistical indices are also ineffective at validating clusters of irregular shapes. In this paper, we address the problem of clustering and validating arbitrarily shaped clusters with a visual framework (VISTA). The main idea of the VISTA approach is to capitalize on the power of visualization and interactive feedback to encourage domain experts to participate in the cluster revision and cluster validation process. The VISTA system has two unique features. First, it implements a linear and reliable visualization model to interactively visualize multi-dimensional datasets in a 2D star-coordinate space. Second, it provides a rich set of user-friendly interactive rendering operations, allowing users to validate and refine the cluster structure based on their visual experience as well as their domain knowledge.
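The abstract names a linear mapping of multi-dimensional data into a 2D star-coordinate space but does not give its formula. As a rough illustration only, the sketch below uses a generic star-coordinate projection: each dimension gets an axis at an evenly spaced angle, and a point is placed at the weighted sum of its normalized values along those axes. The function name `star_coordinates`, the min-max normalization, and the per-axis weights `alphas` are assumptions made for this example, not the VISTA system's actual model or API.

```python
import math


def star_coordinates(points, alphas=None):
    """Project k-dimensional points to 2D with a generic star-coordinate mapping.

    Assumptions (not taken from the paper): values are min-max normalized
    per dimension to [0, 1], axes are evenly spaced around the origin, and
    `alphas` are hypothetical per-axis weights a user could adjust.
    """
    k = len(points[0])
    if alphas is None:
        alphas = [1.0] * k

    # Per-dimension range used for min-max normalization.
    mins = [min(p[i] for p in points) for i in range(k)]
    maxs = [max(p[i] for p in points) for i in range(k)]

    # Axis i points in direction 2*pi*i/k.
    angles = [2.0 * math.pi * i / k for i in range(k)]

    projected = []
    for p in points:
        x = y = 0.0
        for i in range(k):
            span = maxs[i] - mins[i]
            # A constant dimension carries no information; map it to 0.
            v = (p[i] - mins[i]) / span if span > 0 else 0.0
            x += alphas[i] * v * math.cos(angles[i])
            y += alphas[i] * v * math.sin(angles[i])
        projected.append((x, y))
    return projected


if __name__ == "__main__":
    data = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0)]
    for point, xy in zip(data, star_coordinates(data)):
        print(point, "->", xy)
```

Because the mapping is a weighted sum, changing one entry of `alphas` stretches or shrinks every point's contribution along that axis only. Interactive star-coordinate tools typically rely on this kind of adjustment to pull overlapping clusters apart.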


Similar resources

VISTA: validating and refining clusters via visualization

Clustering is an important technique for understanding large multi-dimensional datasets. Most clustering research to date has focused on developing automatic clustering algorithms and cluster validation methods. The automatic algorithms are known to work well with clusters of regular shapes, e.g. compact spherical shapes, but may incur higher error rates when dealing with ...


Validating and Refining Clusters via Visual Rendering

Clustering is an important technique for understanding and analysis of large multi-dimensional datasets in many scientific applications. Most clustering research to date has focused on developing automatic clustering algorithms or cluster validation methods. The automatic algorithms are known to work well with clusters of regular shapes, e.g. compact spherical shapes, but may...


Optimizing star-coordinate visualization models for effective interactive cluster exploration on big data

Interactive visual cluster analysis is the most intuitive way for finding clustering patterns, validating algorithmic clustering results, understanding data clusters with domain knowledge, and refining cluster definitions. The most challenging step is visualizing multidimensional data and allowing a user to interactively explore the data to identify clustering structures. In this paper, we syst...


The L.L. Thurstone Psychometric Laboratory, University of North Carolina

This chapter presents ViSta-PrnCmp, the module for Principal Components Analysis (PCA) in ViSta. This procedure analyzes numerical variables so that they can be represented in a lower-dimensional space. The visualization for ViSta-PrnCmp includes a scatterplot-matrix of component scores; a bi-dimensional biplot; a tri-dimensional (spin-plot) version of the biplot; a box-diamond-dot...


VISTA Variables in the Via Lactea (VVV): The public ESO near-IR variability survey of the Milky Way

We describe the public ESO near-IR variability survey (VVV) scanning the Milky Way bulge and an adjacent section of the mid-plane where star formation activity is high. The survey will take 1929 hours of observations with the 4-metre VISTA telescope during five years (2010-2014), covering ~10^9 point sources across an area of 520 deg^2, including 33 known globular clusters and ~350 open cluste...




Publication date: 2004